cavity method
Evidence of Replica Symmetry Breaking under the Nishimori conditions in epidemic inference on graphs
Braunstein, Alfredo, Budzynski, Louise, Mariani, Matteo, Ricci-Tersenghi, Federico
In Bayesian inference, computing the posterior distribution from the data is typically a non-trivial problem, which usually requires approximations such as mean-field approaches or numerical methods, like the Monte Carlo Markov Chain. Being a high-dimensional distribution over a set of correlated variables, the posterior distribution can undergo the notorious replica symmetry breaking transition. When it happens, several mean-field methods and virtually every Monte Carlo scheme can not provide a reasonable approximation to the posterior and its marginals. Replica symmetry is believed to be guaranteed whenever the data is generated with known prior and likelihood distributions, namely under the so-called Nishimori conditions. In this paper, we break this belief, by providing a counter-example showing that, under the Nishimori conditions, replica symmetry breaking arises. Introducing a simple, geometrical model that can be thought of as a patient zero retrieval problem in a highly infectious regime of the epidemic Susceptible-Infectious model, we show that under the Nishimori conditions, there is evidence of replica symmetry breaking. We achieve this result by computing the instability of the replica symmetric cavity method toward the one step replica symmetry broken phase. The origin of this phenomenon -- replica symmetry breaking under the Nishimori conditions -- is likely due to the correlated disorder appearing in the epidemic models.
Contextual Stochastic Block Models
Deshpande, Yash, Sen, Subhabrata, Montanari, Andrea, Mossel, Elchanan
We provide the first information theoretical tight analysis for inference of latent community structure given a sparse graph along with high dimensional node covariates, correlated with the same latent communities. Our work bridges recent theoretical breakthroughs in detection of latent community structure without nodes covariates and a large body of empirical work using diverse heuristics for combining node covariates with graphs for inference. The tightness of our analysis implies in particular, the information theoretic necessity of combining the different sources of information. Our analysis holds for networks of large degrees as well as for a Gaussian version of the model.
Contextual Stochastic Block Models
Deshpande, Yash, Sen, Subhabrata, Montanari, Andrea, Mossel, Elchanan
We provide the first information theoretical tight analysis for inference of latent community structure given a sparse graph along with high dimensional node covariates, correlated with the same latent communities. Our work bridges recent theoretical breakthroughs in detection of latent community structure without nodes covariates and a large body of empirical work using diverse heuristics for combining node covariates with graphs for inference. The tightness of our analysis implies in particular, the information theoretic necessity of combining the different sources of information. Our analysis holds for networks of large degrees as well as for a Gaussian version of the model.
Detection limits in the high-dimensional spiked rectangular model
Alaoui, Ahmed El, Jordan, Michael I.
We study the problem of detecting the presence of a single unknown spike in a rectangular data matrix, in a high-dimensional regime where the spike has fixed strength and the aspect ratio of the matrix converges to a finite limit. This setup includes Johnstone's spiked covariance model. We analyze the likelihood ratio of the spiked model against an "all noise" null model of reference, and show it has asymptotically Gaussian fluctuations in a region below---but in general not up to---the so-called BBP threshold from random matrix theory. Our result parallels earlier findings of Onatski et al.\ (2013) and Johnstone-Onatski (2015) for spherical spikes. We present a probabilistic approach capable of treating generic product priors. In particular, sparsity in the spike is allowed. Our approach is based on Talagrand's interpretation of the cavity method from spin-glass theory. The question of the maximal parameter region where asymptotic normality is expected to hold is left open. This region is shaped by the prior in a non-trivial way. We conjecture that this is the entire paramagnetic phase of an associated spin-glass model, and is defined by the vanishing of the replica-symmetric solution of Lesieur et al.\ (2015).
A statistical physics approach to learning curves for the Inverse Ising problem
Bachschmid-Romano, Ludovica, Opper, Manfred
Using methods of statistical physics, we analyse the error of learning couplings in large Ising models from independent data (the inverse Ising problem). We concentrate on learning based on local cost functions, such as the pseudo-likelihood method for which the couplings are inferred independently for each spin. Assuming that the data are generated from a true Ising model, we compute the reconstruction error of the couplings using a combination of the replica method with the cavity approach for densely connected systems. We show that an explicit estimator based on a quadratic cost function achieves minimal reconstruction error, but requires the length of the true coupling vector as prior knowledge. A simple mean field estimator of the couplings which does not need such knowledge is asymptotically optimal, i.e. when the number of observations is much large than the number of spins. Comparison of the theory with numerical simulations shows excellent agreement for data generated from two models with random couplings in the high temperature region: a model with independent couplings (Sherrington-Kirkpatrick model), and a model where the matrix of couplings has a Wishart distribution.
Finding One Community in a Sparse Graph
We consider a random sparse graph with bounded average degree, in which a subset of vertices has higher connectivity than the background. In particular, the average degree inside this subset of vertices is larger than outside (but still bounded). Given a realization of such graph, we aim at identifying the hidden subset of vertices. This can be regarded as a model for the problem of finding a tightly knitted community in a social network, or a cluster in a relational dataset. In this paper we present two sets of contributions: $(i)$ We use the cavity method from spin glass theory to derive an exact phase diagram for the reconstruction problem. In particular, as the difference in edge probability increases, the problem undergoes two phase transitions, a static phase transition and a dynamic one. $(ii)$ We establish rigorous bounds on the dynamic phase transition and prove that, above a certain threshold, a local algorithm (belief propagation) correctly identify most of the hidden set. Below the same threshold \emph{no local algorithm} can achieve this goal. However, in this regime the subset can be identified by exhaustive search. For small hidden sets and large average degree, the phase transition for local algorithms takes an intriguingly simple form. Local algorithms succeed with high probability for ${\rm deg}_{\rm in} - {\rm deg}_{\rm out} > \sqrt{{\rm deg}_{\rm out}/e}$ and fail for ${\rm deg}_{\rm in} - {\rm deg}_{\rm out} < \sqrt{{\rm deg}_{\rm out}/e}$ (with ${\rm deg}_{\rm in}$, ${\rm deg}_{\rm out}$ the average degrees inside and outside the community). We argue that spectral algorithms are also ineffective in the latter regime. It is an open problem whether any polynomial time algorithms might succeed for ${\rm deg}_{\rm in} - {\rm deg}_{\rm out} < \sqrt{{\rm deg}_{\rm out}/e}$.
Phase Transitions in Community Detection: A Solvable Toy Model
Steeg, Greg Ver, Moore, Cristopher, Galstyan, Aram, Allahverdyan, Armen E.
Recently, it was shown that there is a phase transition in the community detection problem. This transition was first computed using the cavity method, and has been proved rigorously in the case of $q=2$ groups. However, analytic calculations using the cavity method are challenging since they require us to understand probability distributions of messages. We study analogous transitions in so-called "zero-temperature inference" model, where this distribution is supported only on the most-likely messages. Furthermore, whenever several messages are equally likely, we break the tie by choosing among them with equal probability. While the resulting analysis does not give the correct values of the thresholds, it does reproduce some of the qualitative features of the system. It predicts a first-order detectability transition whenever $q > 2$, while the finite-temperature cavity method shows that this is the case only when $q > 4$. It also has a regime analogous to the "hard but detectable" phase, where the community structure can be partially recovered, but only when the initial messages are sufficiently accurate. Finally, we study a semisupervised setting where we are given the correct labels for a fraction $\rho$ of the nodes. For $q > 2$, we find a regime where the accuracy jumps discontinuously at a critical value of $\rho$.
Balanced K-SAT and Biased random K-SAT on trees
Sumedha, null, Krishnamurthy, Supriya, Sahoo, Sharmistha
We study and solve some variations of the random K-satisfiability problem - balanced K-SAT and biased random K-SAT - on a regular tree, using techniques we have developed earlier(arXiv:1110.2065). In both these problems, as well as variations of these that we have looked at, we find that the SAT-UNSAT transition obtained on the Bethe lattice matches the exact threshold for the same model on a random graph for K=2 and is very close to the numerical value obtained for K=3. For higher K it deviates from the numerical estimates of the solvability threshold on random graphs, but is very close to the dynamical 1-RSB threshold as obtained from the first non-trivial fixed point of the survey propagation algorithm.
Exact learning curves for Gaussian process regression on large random graphs
We study learning curves for Gaussian process regression which characterise performance interms of the Bayes error averaged over datasets of a given size. Whilst learning curves are in general very difficult to calculate we show that for discrete input domains, where similarity between input points is characterised in terms of a graph, accurate predictions can be obtained. These should in fact become exact for large graphs drawn from a broad range of random graph ensembles with arbitrary degree distributions where each input (node) is connected only to a finite number of others. Our approach is based on translating the appropriate belief propagation equations to the graph ensemble. We demonstrate the accuracy of the predictions for Poisson (Erdos-Renyi) and regular random graphs, and discuss when and why previous approximations of the learning curve fail.